Solving the Linear Bellman Equation via Dual Kernel Embeddings
Abstract
We introduce a data-efficient approach for solving the linear Bellman equation, which corresponds to a class of Markov decision processes (MDPs) and stochastic optimal control (SOC) problems. We show that this class of control problems can be cast as a stochastic composition optimization problem, which can be further reformulated as a saddle point problem and solved via dual kernel embeddings [1]. Our method is model-free and uses only one sample per state transition from stochastic dynamical systems. Unlike related work such as Z-learning [2, 3], which is based on temporal-difference learning [4], our method is an online algorithm that follows the true stochastic gradient. Numerical results show that our method outperforms the Z-learning algorithm.
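For orientation, here is a minimal sketch of the Z-learning baseline that the abstract compares against. It is not the dual kernel embedding method of the paper. It assumes the first-exit form of the linear Bellman equation commonly used for linearly solvable MDPs, z(x) = exp(-q(x)) E[z(x')], where z = exp(-v) is the desirability, q is the state cost, and x' is drawn from the passive dynamics. The five-state chain, the cost values, the step-size schedule, and the helper names `make_chain`, `solve_exact`, and `z_learning` are all hypothetical choices made for illustration.

```python
import numpy as np

def make_chain(n=5, q_interior=0.2):
    """Hypothetical toy problem: random walk on states 0..n-1, state n-1 is the goal."""
    P = np.zeros((n, n))
    P[0, 0] = P[0, 1] = 0.5              # reflecting left boundary
    for i in range(1, n - 1):
        P[i, i - 1] = P[i, i + 1] = 0.5  # unbiased passive dynamics
    P[n - 1, n - 1] = 1.0                # goal state is absorbing
    q = np.full(n, q_interior)           # assumed constant state cost
    q[n - 1] = 0.0                       # no cost at the goal
    is_terminal = np.zeros(n, dtype=bool)
    is_terminal[n - 1] = True
    return P, q, is_terminal

def solve_exact(P, q, is_terminal, iters=2000):
    """Model-based reference: iterate z <- exp(-q) * (P @ z) on non-terminal states."""
    z = np.ones(len(q))
    for _ in range(iters):
        z = np.where(is_terminal, np.exp(-q), np.exp(-q) * (P @ z))
    return z

def z_learning(P, q, is_terminal, episodes=2000, c=20.0, seed=0):
    """Model-free estimate: one sampled transition per update, TD-style averaging."""
    rng = np.random.default_rng(seed)
    n = len(q)
    z = np.where(is_terminal, np.exp(-q), 1.0)
    visits = np.zeros(n)
    starts = np.flatnonzero(~is_terminal)
    for _ in range(episodes):
        x = rng.choice(starts)
        while not is_terminal[x]:
            x_next = rng.choice(n, p=P[x])   # sample from the passive dynamics
            visits[x] += 1
            eta = c / (c + visits[x])        # assumed decaying step size
            z[x] = (1 - eta) * z[x] + eta * np.exp(-q[x]) * z[x_next]
            x = x_next
    return z

if __name__ == "__main__":
    P, q, is_terminal = make_chain()
    z_star = solve_exact(P, q, is_terminal)
    z_hat = z_learning(P, q, is_terminal)
    print("exact z:     ", np.round(z_star, 3))
    print("Z-learning z:", np.round(z_hat, 3))
    print("value v = -log z:", np.round(-np.log(z_star), 3))
```

The update in `z_learning` bootstraps from the current estimate of z at the next state, which is the temporal-difference character the abstract contrasts with following a true stochastic gradient.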
Related resources
Wavelet-based numerical method for solving fractional integro-differential equation with a weakly singular kernel
This paper describes and compares the application of wavelet bases and Block-Pulse functions (BPFs) to solving a fractional integro-differential equation (FIDE) with a weakly singular kernel. First, a collocation method based on Haar wavelets (HW), Legendre wavelets (LW), Chebyshev wavelets (CHW), second kind Chebyshev wavelets (SKCHW), Cos and Sin wavelets (CASW) and BPFs is presented f...
Hilbert Space Embeddings of POMDPs
A nonparametric approach for policy learning for POMDPs is proposed. The approach represents distributions over the states, observations, and actions as embeddings in feature spaces, which are reproducing kernel Hilbert spaces. Distributions over states given the observations are obtained by applying the kernel Bayes’ rule to these distribution embeddings. Policies and value functions are defin...
The solving linear one-dimensional Volterra integral equations of the second kind in reproducing kernel space
This paper solves a linear one-dimensional Volterra integral equation of the second kind. Using the form of the equation, we define a linear transformation and, using its conjugate and reproducing kernel functions, obtain a basis for the function space. We then obtain the solution of the integral equation in terms of the basis functions. The examples presented in this ...
Kernel-Based Reinforcement Learning Using Bellman Residual Elimination
This paper presents a class of new approximate policy iteration algorithms for solving infinite-horizon, discounted Markov decision processes (MDPs) for which a model of the system is available. The algorithms are similar in spirit to Bellman residual minimization methods. However, by exploiting kernel-based regression techniques with nondegenerate kernel functions as the underlying cost-to-go ...
An Interior Point Algorithm for Solving Convex Quadratic Semidefinite Optimization Problems Using a New Kernel Function
In this paper, we consider convex quadratic semidefinite optimization problems and provide a primal-dual Interior Point Method (IPM) based on a new kernel function with a trigonometric barrier term. Iteration complexity of the algorithm is analyzed using some easy to check and mild conditions. Although our proposed kernel function is neither a Self-Regular (SR) fun...
Publication date: 2017